representation system

Understanding Multimodal Hallucination with Parameter-Free Representation Alignment

Wang, Yueqian, Liang, Jianxin, Wang, Yuxuan, Zhang, Huishuai, Zhao, Dongyan

arXiv.org Artificial Intelligence

Wangxuan Institute of Computer Technology, Peking University; National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence. wangyuxuan1@bigai.ai. Hallucination is a common issue in Multimodal Large Language Models (MLLMs), yet the underlying principles remain poorly understood. In this paper, we investigate which components of MLLMs contribute to object hallucinations. To analyze image representations while avoiding the influence of all factors other than the image representation itself, we propose a parameter-free representation alignment metric (Pfram) that can measure the similarity between any two representation systems without requiring additional training parameters. Notably, Pfram can also assess the alignment of a neural representation system with the human representation system, represented by ground-truth annotations of images. By evaluating alignment with object annotations, we demonstrate that this metric shows strong and consistent correlations with object hallucination across a wide range of state-of-the-art MLLMs, spanning various model architectures and sizes. Furthermore, using this metric, we explore other key issues related to image representations in MLLMs, such as the role of different modules, the impact of textual instructions, and potential improvements, including the use of alternative visual encoders. Multimodal Large Language Models (MLLMs) have been advancing rapidly in recent years (Dai et al., 2023; Liu et al., 2023b;c; Zhang et al., 2023; Dong et al., 2024; Bai et al., 2023).
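To make the general idea concrete, the sketch below computes a simple parameter-free alignment score between two representation systems by comparing nearest-neighbour sets under the Jaccard overlap. The function and its name are illustrative only, not the paper's exact Pfram definition:

```python
import numpy as np

def knn_sets(X, k):
    # Pairwise Euclidean distances; collect each item's k nearest neighbours (excluding itself).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return [set(np.argsort(row)[:k]) for row in d]

def knn_alignment(A, B, k=3):
    # Mean Jaccard overlap between k-NN sets computed in each representation space.
    # Nothing is trained: the score depends only on the two representations.
    sa, sb = knn_sets(A, k), knn_sets(B, k)
    return float(np.mean([len(a & b) / len(a | b) for a, b in zip(sa, sb)]))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))          # 50 "images" in an 8-d representation space
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
print(knn_alignment(X, X @ Q))        # a rotation preserves neighbourhoods: close to 1.0
print(knn_alignment(X, rng.normal(size=(50, 8))))  # an unrelated representation: near 0
```

Replacing one argument with ground-truth object annotations (e.g. sets of annotated objects per image) would, in the same spirit, measure alignment with a human representation system.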


The Music Note Ontology

Poltronieri, Andrea, Gangemi, Aldo

arXiv.org Artificial Intelligence

In this paper we propose the Music Note Ontology, an ontology for modelling music notes and their realisation. The ontology addresses the relation between a note represented in a symbolic representation system, and its realisation, i.e. a musical performance. This work therefore aims to solve the modelling and representation issues that arise when analysing the relationships between abstract symbolic features and the corresponding physical features of an audio signal. The ontology is composed of three different Ontology Design Patterns (ODP), which model the structure of the score (Score Part Pattern), the note in the symbolic notation (Music Note Pattern) and its realisation (Musical Object Pattern).


The HaMSE Ontology: Using Semantic Technologies to support Music Representation Interoperability and Musicological Analysis

Poltronieri, Andrea, Gangemi, Aldo

arXiv.org Artificial Intelligence

The use of Semantic Technologies - in particular the Semantic Web - has proved to be a powerful tool for describing the cultural heritage domain and artistic practices. However, the panorama of ontologies for musicological applications seems limited and restricted to specific applications. In this research, we propose HaMSE, an ontology capable of describing musical features that can assist musicological research. More specifically, HaMSE addresses issues that have affected musicological research for decades: the representation of music and the relationship between quantitative and qualitative data. To do this, HaMSE allows alignment between different music representation systems and describes a set of musicological features that enable music analysis at different levels of granularity.


Optimal Approximation with Sparse Neural Networks and Applications

Hong, Khay Boon

arXiv.org Artificial Intelligence

We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb R^d)$ by restricting the connectivity and the memory required to store the networks. We also introduce representation systems - countable collections of functions used to guide neural networks - since approximation theory with representation systems is well developed in mathematics. We then prove the fundamental bound theorem, implying that a quantity intrinsic to the function class itself can give information about the approximation ability of neural networks and representation systems. We also provide a method for transferring existing theories about approximation by representation systems to neural networks, greatly amplifying the practical value of neural networks. Finally, we use neural networks to approximate B-spline functions, which are used to generate B-spline curves. We then analyse the complexity of a class called $\beta$ cartoon-like functions using rate-distortion theory and wedgelets construction.
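For concreteness, a B-spline basis function of the kind such networks are asked to approximate can be evaluated with the standard Cox-de Boor recursion; this is a textbook construction, not code from the paper:

```python
def bspline_basis(i, p, knots, x):
    # Cox-de Boor recursion for the i-th B-spline basis function of degree p.
    if p == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = (x - knots[i]) / (knots[i + p] - knots[i]) * bspline_basis(i, p - 1, knots, x)
    if knots[i + p + 1] != knots[i + 1]:
        right = (knots[i + p + 1] - x) / (knots[i + p + 1] - knots[i + 1]) * bspline_basis(i + 1, p - 1, knots, x)
    return left + right

# Degree-2 basis functions on a uniform knot vector form a partition of unity
# inside the domain - a property a network approximant must reproduce.
knots = [0, 1, 2, 3, 4, 5, 6]
total = sum(bspline_basis(i, 2, knots, 2.5) for i in range(len(knots) - 3))
print(total)  # 1.0
```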


Notation system allows scientists to communicate polymers more easily

#artificialintelligence

Having a compact, yet robust, structurally-based identifier or representation system for molecular structures is a key enabling factor for efficient sharing and dissemination of results within the research community. Such systems also lay down the essential foundations for machine learning and other data-driven research. While substantial advances have been made for small molecules, the polymer community has struggled to come up with an efficient representation system. For small molecules, the basic premise is that each distinct chemical species corresponds to a well-defined chemical structure. This does not hold for polymers.


A semi-holographic hyperdimensional representation system for hardware-friendly cognitive computing

Serb, A., Kobyzev, I., Wang, J., Prodromakis, T.

arXiv.org Artificial Intelligence

One of the main long-term objectives of artificial intelligence is the creation of thinking machines. To that end, substantial effort has been placed into designing cognitive systems, i.e. systems that can manipulate semantic-level information. A substantial part of that effort is oriented towards designing the mathematical machinery underlying cognition in a way that is very efficiently implementable in hardware. In this work we propose a 'semi-holographic' representation system that can be implemented in hardware using only multiplexing and addition operations, thus avoiding the need for expensive multiplication. The resulting architecture can be readily constructed by recycling standard microprocessor elements and is capable of performing two key mathematical operations frequently used in cognition, superposition and binding, within a budget of below 6 pJ for 64-bit operands. Our proposed 'cognitive processing unit' (CoPU) is intended as just one (albeit crucial) part of much larger cognitive systems, where artificial neural networks of all kinds and associative memories work in concert to give rise to intelligence.
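The two operations named above, superposition and binding, can be illustrated with the classic binary-hypervector formulation of hyperdimensional computing. This is a software sketch of the generic technique, not the paper's multiplexer-and-adder circuit:

```python
import numpy as np

rng = np.random.default_rng(42)
D = 10_000  # hypervector dimensionality; random vectors in high dimension are quasi-orthogonal

def rand_hv():
    return rng.integers(0, 2, size=D, dtype=np.int8)

def bind(a, b):
    # Binding as elementwise XOR: self-inverse, and needs no multiplier hardware.
    return a ^ b

def bundle(*hvs):
    # Superposition as an elementwise majority vote over the summed bits.
    s = np.sum(hvs, axis=0)
    return (2 * s > len(hvs)).astype(np.int8)

def sim(a, b):
    # Similarity = 1 - normalised Hamming distance; about 0.5 means "unrelated".
    return 1.0 - float(np.mean(a != b))

role, filler = rand_hv(), rand_hv()
pair = bind(role, filler)
print(sim(bind(pair, role), filler))  # 1.0: unbinding with the role recovers the filler exactly
a, b, c = rand_hv(), rand_hv(), rand_hv()
m = bundle(a, b, c)
print(sim(m, a) > 0.7, sim(m, rand_hv()) < 0.55)  # a bundle stays close to its members only
```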


Deep Neural Network Approximation Theory

Grohs, Philipp, Perekrestenko, Dmytro, Elbrächter, Dennis, Bölcskei, Helmut

arXiv.org Machine Learning

Deep neural networks have become state-of-the-art technology for a wide range of practical machine learning tasks such as image classification, handwritten digit recognition, speech recognition, or game intelligence. This paper develops the fundamental limits of learning in deep neural networks by characterizing what is possible if no constraints on the learning algorithm and the amount of training data are imposed. Concretely, we consider information-theoretically optimal approximation through deep neural networks with the guiding theme being a relation between the complexity of the function (class) to be approximated and the complexity of the approximating network in terms of connectivity and memory requirements for storing the network topology and the associated quantized weights. The theory we develop educes remarkable universality properties of deep networks. Specifically, deep networks are optimal approximants for vastly different function classes such as affine systems and Gabor systems. This universality is afforded by a concurrent invariance property of deep networks to time-shifts, scalings, and frequency-shifts. In addition, deep networks provide exponential approximation accuracy - i.e., the approximation error decays exponentially in the number of non-zero weights in the network - for vastly different functions such as the squaring operation, multiplication, polynomials, sinusoidal functions, general smooth functions, and even one-dimensional oscillatory textures and fractal functions such as the Weierstrass function, the last two of which have no other known methods achieving exponential approximation accuracy. In summary, deep neural networks provide information-theoretically optimal approximation of a very wide range of functions and function classes used in mathematical signal processing.
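The squaring operation mentioned above admits a well-known explicit construction (a Yarotsky-style sawtooth composition, each stage of which is realisable by a few ReLU units); the following is a sketch of that standard construction, not code from the paper:

```python
def hat(x):
    # Triangle ("sawtooth generator") function on [0, 1]; realisable with three ReLU units.
    return 2 * x if x < 0.5 else 2 * (1 - x)

def approx_square(x, m=6):
    # x**2 on [0, 1] as x minus a telescoping sum of composed hats.
    # Each extra level of composition shrinks the error by a factor of 4,
    # i.e. accuracy improves exponentially in network depth.
    total, h = x, x
    for s in range(1, m + 1):
        h = hat(h)
        total -= h / 4 ** s
    return total

xs = [i / 200 for i in range(201)]
err = max(abs(approx_square(x) - x * x) for x in xs)
print(err < 1e-3)  # True: the error is bounded by 4 ** -(m + 1)
```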


New Decimal Systems - Great Sandbox for Data Scientists and Mathematicians

@machinelearnbot

We illustrate pattern recognition techniques applied to an interesting mathematical problem: the representation of a number in non-conventional systems, generalizing the familiar base-2 or base-10 systems. The emphasis is on data science rather than mathematical theory, and the style is that of a tutorial, requiring minimal knowledge of mathematics or statistics. However, some off-the-beaten-path, state-of-the-art number theory research is discussed here in a way that is accessible to college students after a first course in statistics. This article is also peppered with mathematical and statistical oddities - for instance, the fact that there are units of information smaller than the bit. You will also learn how the discovery process works, as I have included research that I thought would lead to interesting results, but did not. In scientific publishing, only final, successful results are typically presented, while in reality most research leads to dead ends and is not made available to the reader.
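As a taste of such non-conventional systems, here is base -2 ("negabinary"), where every integer - negative ones included - has an expansion using only the digits 0 and 1, with no sign needed. This is a generic illustration, not an example taken from the article:

```python
def to_negabinary(n):
    # Greedy digit extraction in base -2: repeatedly divide by -2,
    # forcing a non-negative remainder by carrying into the quotient.
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:
            n += 1
            r += 2
        digits.append(str(r))
    return "".join(reversed(digits))

def from_negabinary(s):
    # Evaluate the digit string against powers of -2.
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(s)))

print(to_negabinary(6))   # 11010  (16 - 8 - 2 = 6)
print(to_negabinary(-3))  # 1101   (-8 + 4 + 1 = -3)
```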


Multilingual Topic Models

Krstovski, Kriste, Kurtz, Michael J., Smith, David A., Accomazzi, Alberto

arXiv.org Machine Learning

Scientific publications have evolved several features for mitigating vocabulary mismatch when indexing, retrieving, and computing similarity between articles. These mitigation strategies range from simply focusing on high-value article sections, such as titles and abstracts, to assigning keywords, often from controlled vocabularies, either manually or through automatic annotation. Various document representation schemes possess different cost-benefit tradeoffs. In this paper, we propose to model different representations of the same article as translations of each other, all generated from a common latent representation in a multilingual topic model. We start with a methodological overview of latent variable models for parallel document representations that could be used across many information science tasks. We then show how solving the inference problem of mapping diverse representations into a shared topic space allows us to evaluate representations based on how topically similar they are to the original article. In addition, our proposed approach provides a means of discovering where different concept vocabularies require improvement.
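The evaluation idea - comparing representations by their topical similarity to the original article in a shared topic space - can be sketched in miniature. The topic-word matrix and the crude projection below are hypothetical stand-ins for a trained topic model and proper inference:

```python
import numpy as np

def topic_mixture(counts, topic_word):
    # Crude topic inference: project term counts onto topic-word weights
    # and renormalise - a stand-in for real inference in a topic model.
    w = counts @ topic_word.T
    return w / w.sum()

def js_divergence(p, q):
    # Jensen-Shannon divergence between two topic mixtures (lower = more similar).
    m = 0.5 * (p + q)
    kl = lambda a, b: sum(x * np.log(x / y) for x, y in zip(a, b) if x > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical 2-topic, 4-term model: topic 0 favours terms {0, 1}, topic 1 terms {2, 3}.
topic_word = np.array([[0.4, 0.4, 0.1, 0.1],
                       [0.1, 0.1, 0.4, 0.4]])
abstract = np.array([5.0, 4.0, 1.0, 0.0])  # term counts for the full abstract
title    = np.array([2.0, 1.0, 0.0, 0.0])  # a shorter representation of the same article
other    = np.array([0.0, 0.0, 3.0, 4.0])  # an unrelated document
ta, tt, to = (topic_mixture(c, topic_word) for c in (abstract, title, other))
print(js_divergence(ta, tt) < js_divergence(ta, to))  # True: the title stays topically close
```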